-
Free, publicly-accessible full text available February 1, 2026
-
Abstract Finding similarities between model parameters across different catchments has proved to be challenging. Existing approaches struggle due to catchment heterogeneity and non‐linear dynamics. In particular, attempts to correlate catchment attributes with hydrological responses have failed due to interdependencies among variables and consequent equifinality. Machine Learning (ML), particularly the Long Short‐Term Memory (LSTM) approach, has demonstrated strong predictive and spatial regionalization performance. However, understanding the nature of the regionalization relationships remains difficult. This study proposes a novel approach to partially decouple learning the representation of (a) catchment dynamics by using the HydroLSTM architecture and (b) spatial regionalization relationships by using a Random Forest (RF) clustering approach to learn the relationships between the catchment attributes and dynamics. This coupled approach, called Regional HydroLSTM, learns a representation of “potential streamflow” using a single cell‐state, while the output gate corrects it to correspond to the temporal context of the current hydrologic regime. RF clusters mediate the relationship between catchment attributes and dynamics, allowing identification of spatially consistent hydrological regions, thereby providing insight into the factors driving spatial and temporal hydrological variability. Results suggest that by combining complementary architectures, we can enhance the interpretability of regional machine learning models in hydrology, offering a new perspective on the “catchment classification” problem. We conclude that an improved understanding of the underlying nature of hydrologic systems can be achieved by careful design of ML architectures to target the specific things we are seeking to learn from the data.
Free, publicly-accessible full text available August 1, 2026
-
Several studies have demonstrated the ability of long short-term memory (LSTM) machine-learning-based modeling to outperform traditional spatially lumped process-based modeling approaches for streamflow prediction. However, due mainly to the structural complexity of the LSTM network (which includes gating operations and sequential processing of the data), difficulties can arise when interpreting the internal processes and weights in the model. Here, we propose and test a modification of LSTM architecture that is calibrated in a manner that is analogous to a hydrological system. Our architecture, called “HydroLSTM”, simulates the sequential updating of the Markovian storage while the gating operation has access to historical information. Specifically, we modify how data are fed to the new representation to facilitate simultaneous access to past lagged inputs and consolidated information, which explicitly acknowledges the importance of trends and patterns in the data. We compare the performance of the HydroLSTM and LSTM architectures using data from 10 hydro-climatically varied catchments. We further examine how the new architecture exploits the information in lagged inputs, for 588 catchments across the USA. The HydroLSTM-based models require fewer cell states to obtain similar performance to their LSTM-based counterparts. Further, the weight patterns associated with lagged input variables are interpretable and consistent with regional hydroclimatic characteristics (snowmelt-dominated, recent rainfall-dominated, and historical rainfall-dominated). These findings illustrate how the hydrological interpretability of LSTM-based models can be enhanced by appropriate architectural modifications that are physically and conceptually consistent with our understanding of the system.
-
Increases in evapotranspiration (ET) from global warming are decreasing streamflow in headwater basins worldwide. However, these streamflow losses do not occur uniformly due to complex topography. To better understand the heterogeneity of streamflow loss, we use the Budyko shape parameter (ω) as a diagnostic tool. We fit ω to 37 years of hydrologic simulation output in the Upper Colorado River Basin (UCRB), an important headwater basin in the US. We split the UCRB into two categories: peak watersheds with high elevation and steep slopes, and valley watersheds with lower elevation and gradual slopes. Our results demonstrate a relationship between streamflow loss and ω. The valley watersheds with greater streamflow loss have ω higher than 3.1, while the peak watersheds with less streamflow loss have an average ω of 1.3. This work highlights the use of ω as an indicator of streamflow loss and could be generalized to other headwater basin systems.
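As a rough illustration of what fitting a Budyko shape parameter can look like, the sketch below assumes the commonly used Fu-type form of the Budyko curve, ET/P = 1 + PET/P − [1 + (PET/P)^ω]^(1/ω). The abstract does not state which formulation or fitting procedure the authors used, so the equation choice, the function names (`fu_evaporative_fraction`, `fit_omega`), the grid-search fit, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fu_evaporative_fraction(aridity, omega):
    """Fu-type Budyko curve (assumed form): ET/P as a function of PET/P."""
    aridity = np.asarray(aridity, dtype=float)
    return 1.0 + aridity - (1.0 + aridity ** omega) ** (1.0 / omega)

def fit_omega(aridity, evap_fraction, omega_grid=None):
    """Least-squares estimate of omega by a simple grid search (illustrative)."""
    if omega_grid is None:
        # Hypothetical search range with a step of 0.01.
        omega_grid = np.linspace(1.01, 10.0, 900)
    evap_fraction = np.asarray(evap_fraction, dtype=float)
    errors = [
        np.sum((fu_evaporative_fraction(aridity, w) - evap_fraction) ** 2)
        for w in omega_grid
    ]
    return float(omega_grid[int(np.argmin(errors))])

# Synthetic data only: generate ET/P from a known omega, then recover it.
aridity = np.linspace(0.5, 3.0, 20)            # PET/P values (made up)
observed = fu_evaporative_fraction(aridity, 2.6)
omega_hat = fit_omega(aridity, observed)
print(f"recovered omega ~ {omega_hat:.2f}")
```

Under this form, a larger ω means that a larger share of precipitation is partitioned to ET at a given aridity index, leaving less for streamflow.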
-
Abstract. High-resolution, spatially distributed process-based (PB) simulators are widely employed in the study of complex catchment processes and their responses to a changing climate. However, calibrating these PB simulators using observed data remains a significant challenge due to several persistent issues, including the following: (1) intractability stemming from the computational demands and complex responses of simulators, which renders infeasible calculation of the conditional probability of parameters and data, and (2) uncertainty stemming from the choice of simplified representations of complex natural hydrologic processes. Here, we demonstrate how simulation-based inference (SBI) can help address both of these challenges with respect to parameter estimation. SBI uses a learned mapping between the parameter space and observed data to estimate parameters for the generation of calibrated simulations. To demonstrate the potential of SBI in hydrologic modeling, we conduct a set of synthetic experiments to infer two common physical parameters – Manning's coefficient and hydraulic conductivity – using a representation of a snowmelt-dominated catchment in Colorado, USA. We introduce novel deep-learning (DL) components to the SBI approach, including an “emulator” as a surrogate for the PB simulator to rapidly explore parameter responses. We also employ a density-based neural network to represent the joint probability of parameters and data without strong assumptions about its functional form. While addressing intractability, we also show that, if the simulator does not represent the system under study well enough, SBI can yield unreliable parameter estimates. Approaches to adopting the SBI framework for cases in which multiple simulators may be adequate are introduced using a performance-weighting approach.
The synthetic experiments presented here test the performance of SBI, using the relationship between the surrogate and PB simulators as a proxy for the real case.
-
Abstract Water table depth (WTD) has a substantial impact on the connection between groundwater dynamics and land surface processes. Due to the scarcity of WTD observations, physically‐based groundwater models are growing in their ability to map WTD at large scales; however, they are still challenged to represent simulated WTD compared to well observations. In this study, we develop a purely data‐driven approach to estimating WTD at continental scale. We apply a random forest (RF) model to estimate WTD over most of the contiguous United States (CONUS) based on available WTD observations. The estimated WTD are in good agreement with well observations, with a Pearson correlation coefficient (r) of 0.96 (0.81 during testing), a Nash‐Sutcliffe efficiency (NSE) of 0.93 (0.65 during testing), and a root mean square error (RMSE) of 6.87 m (15.31 m during testing). The location of each grid cell is rated as the most important feature in estimating WTD over most of the CONUS, which might be a surrogate for spatial information. In addition, the uncertainty of the RF model is quantified using quantile regression forests. High uncertainties are generally associated with locations having a shallow WTD. Our study demonstrates that the RF model can produce reasonable WTD estimates over most of the CONUS, providing an alternative to physics‐based modeling for modeling large‐scale freshwater resources. Since the CONUS covers many different hydrologic regimes, the RF model trained for the CONUS may be transferrable to other regions with a similar hydrologic regime and limited observations.
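For readers unfamiliar with the three skill scores quoted in the abstract above, the following is a minimal sketch of their standard definitions. The function names and the toy numbers are hypothetical and are not taken from the study; only the formulas (Pearson r, NSE = 1 − Σ(obs − sim)² / Σ(obs − mean(obs))², and RMSE) are standard.

```python
import numpy as np

def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    return float(np.corrcoef(obs, sim)[0, 1])

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2))

def rmse(obs, sim):
    """Root mean square error, in the same units as the data (here, metres)."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))

# Toy water table depths in metres (made up): the estimate is biased by +1 m.
observed = np.array([1.0, 2.0, 3.0, 4.0])
estimated = observed + 1.0
scores = (pearson_r(observed, estimated), nse(observed, estimated), rmse(observed, estimated))
print(scores)
```

The toy case shows why the three scores are reported together: a constant bias leaves the correlation perfect while lowering NSE and raising RMSE.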
